
Using Data Mining Methods to Predict Personally Identifiable Information in Emails


Abstract

Private information management and compliance are important issues for most organizations today. As a major communication tool, email is one of many potential sources of privacy leaks. Information extraction methods have been applied to detect private information in text files. However, because email messages usually consist of low-quality text, information extraction methods for private information detection may not perform well on them. In this paper, we address the problem of predicting the presence of private information in email using data mining and text mining methods. Two prediction models are proposed. The first is based on association rules that predict one type of private information from other types of private information identified in emails. The second is based on classification models that predict private information from the content of the emails. Experiments on the Enron email dataset show promising results.
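
To make the first model more concrete, the following is a minimal sketch of association-rule mining over the PII types found in each email. It is not the authors' implementation: the toy `emails` data, the PII type names, the helper names `support` and `mine_rules`, and the support and confidence thresholds are all assumptions chosen for illustration, and only single-item rules of the form A -> B are considered.

```python
from itertools import combinations

# Hypothetical toy data: each set lists the PII types already detected in one email.
emails = [
    {"name", "phone", "address"},
    {"name", "phone"},
    {"name", "ssn", "date_of_birth"},
    {"name", "phone", "address"},
    {"ssn", "date_of_birth"},
    {"name", "address"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_rules(transactions, min_support=0.3, min_confidence=0.8):
    """Return (antecedent, consequent, support, confidence) for rules A -> B.

    Thresholds are illustrative assumptions, not values from the paper.
    """
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # the pair co-occurs too rarely to be a reliable rule
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


rules = mine_rules(emails)
for lhs, rhs, sup, conf in rules:
    print(f"{lhs} -> {rhs}  support={sup:.2f}  confidence={conf:.2f}")
```

On this toy data, a rule such as `phone -> name` says that every email containing a phone number also contained a name, so detecting the first type suggests checking for the second. A real system would mine rules over PII annotations extracted from a corpus such as the Enron dataset.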
